AITopics

Country: North America (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems, Feb-10-2026, 21:03:49 GMT

Generalization Properties of NAS under Activation and Skip Connection Search

Despite the progress achieved with NAS, little attention has so far been paid to theoretical guarantees on NAS.

architecture search, artificial intelligence, machine learning, (17 more...)

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Switzerland (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

Neural Information Processing Systems, Feb-7-2026, 12:44:58 GMT

1160453108d3e537255e9f7b931f4e90-Paper.pdf

If we know beforehand that the problem is periodic, we can easily solve it, e.g., in Fourier space, or after an appropriate transformation.

activation function, artificial intelligence, machine learning, (18 more...)

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.06)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
North America > United States (0.05)
(4 more...)

Industry:

Banking & Finance (0.69)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.73)

Lan, Qingfeng, Vasan, Gautham, Mahmood, A. Rupam

Efficient Reinforcement Learning by Reducing Forgetting with Elephant Activation Functions

arXiv.org Artificial Intelligence, Sep-24-2025

Catastrophic forgetting has remained a significant challenge for efficient reinforcement learning for decades (Ring 1994, Rivest and Precup 2003). While recent works have proposed effective methods to mitigate this issue, they mainly focus on the algorithmic side. Meanwhile, we do not fully understand what architectural properties of neural networks lead to catastrophic forgetting. This study aims to fill this gap by studying the role of activation functions in the training dynamics of neural networks and their impact on catastrophic forgetting in the reinforcement learning setting. Our study reveals that, besides sparse representations, the gradient sparsity of activation functions also plays an important role in reducing forgetting. Based on this insight, we propose a new class of activation functions, elephant activation functions, that can generate both sparse outputs and sparse gradients. We show that by simply replacing classical activation functions with elephant activation functions in the neural networks of value-based algorithms, we can significantly improve the resilience of neural networks to catastrophic forgetting, thus making reinforcement learning more sample-efficient and memory-efficient.
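To make "sparse outputs and sparse gradients" concrete, here is a minimal Python sketch. The abstract does not give the functional form of elephant activation functions, so the bump-shaped function below, its name `bump`, and its parameters `a` (width) and `d` (steepness) are assumptions chosen for illustration only, not the authors' definition. It shows a function whose output and derivative are both close to zero away from a narrow window around the origin.

```python
def bump(x, a=1.0, d=4):
    """Hypothetical bump-shaped activation: 1 at x = 0, near 0 for |x| >> a."""
    return 1.0 / (1.0 + abs(x / a) ** d)


def bump_grad(x, a=1.0, d=4):
    """Analytic derivative of bump with respect to x (assumes d > 1)."""
    if x == 0:
        return 0.0
    u = abs(x / a)
    sign = 1.0 if x > 0 else -1.0
    return -sign * (d / a) * u ** (d - 1) / (1.0 + u ** d) ** 2


# Output and gradient are both negligible outside a narrow window around 0,
# so only inputs near the bump produce a noticeable activation or update.
for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"x={x:+.1f}  output={bump(x):.4f}  grad={bump_grad(x):+.4f}")
```

Under this assumed form, larger `d` makes the bump flatter on top and steeper at the edges, and `a` sets how wide the active window is.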

artificial intelligence, deep learning, machine learning, (15 more...)

2509.19159

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Sep-3-2025

Mask-PINNs: Mitigating Internal Covariate Shift in Physics-Informed Neural Networks

Jiang, Feilong, Hou, Xiaonan, Ye, Jianqiao, Xia, Min

Physics-Informed Neural Networks (PINNs) have emerged as a powerful framework for solving partial differential equations (PDEs) by embedding physical laws directly into the loss function. However, as a fundamental optimization issue, internal covariate shift (ICS) hinders the stable and effective training of PINNs by disrupting feature distributions and limiting model expressiveness. Unlike standard deep learning tasks, conventional remedies for ICS -- such as Batch Normalization and Layer Normalization -- are not directly applicable to PINNs, as they distort the physical consistency required for reliable PDE solutions. To address this issue, we propose Mask-PINNs, a novel architecture that introduces a learnable mask function to regulate feature distributions while preserving the underlying physical constraints of PINNs. We provide a theoretical analysis showing that the mask suppresses the expansion of feature representations through a carefully designed modulation mechanism. Empirically, we validate the method on multiple PDE benchmarks -- including convection, wave propagation, and Helmholtz equations -- across diverse activation functions. Our results show consistent improvements in prediction accuracy, convergence stability, and robustness. Furthermore, we demonstrate that Mask-PINNs enable the effective use of wider networks, overcoming a key limitation in existing PINN frameworks.

artificial intelligence, deep learning, machine learning, (16 more...)

2505.06331

Country:

Europe (0.46)
North America > Canada (0.28)

Genre: Research Report > New Finding (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Aug-17-2025, 02:34:29 GMT

952b691c116bf753daafa6ce274e81bb-Supplemental-Conference.pdf

activation function, eigenvalue, probability, (16 more...)

Country: North America (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems, Aug-17-2025, 02:34:25 GMT

Generalization Properties of NAS under Activation and Skip Connection Search

Neural Architecture Search (NAS) has fostered the automatic discovery of state-of-the-art neural architectures. Despite the progress achieved with NAS, little attention has so far been paid to theoretical guarantees on NAS.

artificial intelligence, deep learning, machine learning, (16 more...)

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Switzerland (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Gonzalez, Ian Mateos, Nava, Estefani Jaramilla, Morales, Abraham Sánchez, García-Ramírez, Jesús, Ramos-Aguilar, Ricardo

Lightweight Deep Models for Dermatological Disease Detection: A Study on Instance Selection and Channel Optimization

arXiv.org Artificial Intelligence, Apr-1-2025

The identification of dermatological disease is an important problem in Mexico, according to several studies. Several works in the literature use datasets from different repositories without studying the behavior of the data, especially in the medical imaging domain. In this work, we propose a methodology to preprocess the dermaMNIST dataset in order to improve its quality for the classification stage, where we use lightweight convolutional neural networks. In our results, we reduce the number of instances used for neural network training while obtaining performance similar to that of models such as ResNet.

artificial intelligence, deep learning, machine learning, (17 more...)

2504.01208

Country: North America > Mexico > Tlaxcala > Tlaxcala (0.04)

Genre: Research Report > New Finding (0.90)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.70)
Health & Medicine > Therapeutic Area > Dermatology (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

arXiv.org Artificial Intelligence, Jan-13-2025

Performance Optimization of Ratings-Based Reinforcement Learning

Rose, Evelyn, White, Devin, Wu, Mingkang, Lawhern, Vernon, Waytowich, Nicholas R., Cao, Yongcan

This paper explores multiple optimization methods to improve the performance of rating-based reinforcement learning (RbRL). RbRL, a method based on the idea of human ratings, has been developed to infer reward functions in reward-free environments for the subsequent policy learning via standard reinforcement learning, which requires the availability of reward functions. Specifically, RbRL minimizes the cross entropy loss that quantifies the differences between human ratings and estimated ratings derived from the inferred reward. Hence, a low loss means a high degree of consistency between human ratings and estimated ratings. Despite its simple form, RbRL has various hyperparameters and can be sensitive to various factors. Therefore, it is critical to provide comprehensive experiments to understand the impact of various hyperparameters on the performance of RbRL. This paper is a work in progress, providing users some general guidelines on how to select hyperparameters in RbRL.
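The cross-entropy loss mentioned in the abstract can be illustrated with a small Python sketch. The abstract does not describe how RbRL turns the inferred reward into estimated ratings, so the per-class scores passed in below (`class_scores`) and the softmax step are assumptions made for illustration; only the cross-entropy comparison against a human rating follows from the text.

```python
import math


def softmax(scores):
    """Turn raw per-class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rating_cross_entropy(human_rating, class_scores):
    """Cross-entropy between a human rating (class index) and the
    estimated rating distribution derived from hypothetical class scores."""
    probs = softmax(class_scores)
    return -math.log(probs[human_rating])


# Three rating classes: 0 = bad, 1 = neutral, 2 = good (hypothetical labels).
consistent = rating_cross_entropy(2, [0.1, 0.3, 2.5])    # estimate agrees with human
inconsistent = rating_cross_entropy(2, [2.5, 0.3, 0.1])  # estimate disagrees
print(consistent < inconsistent)
```

A lower loss corresponds to estimated ratings that place more probability on the class the human chose, which matches the abstract's reading of low loss as high consistency.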

machine learning, rbrl, reinforcement learning, (14 more...)

2501.07755

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Lamprou, Zenon, Tenedios, Iakovos, Moshfeghi, Yashar

On the Role of Activation Functions in EEG-To-Text Decoder

arXiv.org Artificial Intelligence, Oct-16-2024

In recent years, much interdisciplinary research has been conducted exploring potential use cases of neuroscience to advance the field of information retrieval. Initial research concentrated on the use of fMRI data, but fMRI was deemed to be not suitable for real-world applications, and soon, research shifted towards using EEG data. In this paper, we try to improve the original performance of a first attempt at generating text using EEG by focusing on the less explored area of optimising neural network performance. We test a set of different activation functions and compare their performance. Our results show that introducing a higher degree polynomial activation function can enhance model performance without changing the model architecture. We also show that the learnable 3rd-degree activation function performs better on the 1-gram evaluation compared to a 3rd-degree non-learnable function. However, when evaluating the model on 2-grams and above, the polynomial function underperforms, whilst the leaky ReLU activation function outperforms the baseline.
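As a rough illustration of the activation functions compared in this abstract, the Python sketch below defines a 3rd-degree polynomial activation with explicit coefficients and a leaky ReLU. The abstract does not specify the polynomial's parameterisation or initial values, so the coefficient layout `(a0, a1, a2, a3)`, the function names, and the default slope of 0.01 are assumptions. "Learnable" here means the coefficients would be updated by gradient descent, for which the sketch provides the per-coefficient gradient.

```python
def poly3(x, coeffs):
    """3rd-degree polynomial activation a0 + a1*x + a2*x^2 + a3*x^3 (Horner form)."""
    a0, a1, a2, a3 = coeffs
    return a0 + x * (a1 + x * (a2 + x * a3))


def poly3_coeff_grads(x, upstream=1.0):
    """Gradient of the activation output with respect to each coefficient,
    scaled by the upstream gradient; used when the coefficients are learnable."""
    return (upstream, upstream * x, upstream * x ** 2, upstream * x ** 3)


def leaky_relu(x, slope=0.01):
    """Leaky ReLU: identity for x >= 0, small linear slope for x < 0."""
    return x if x >= 0 else slope * x


# One hypothetical gradient-descent step on the polynomial coefficients.
coeffs = (0.0, 1.0, 0.0, 0.0)            # starts as the identity function
grads = poly3_coeff_grads(2.0, upstream=0.5)
lr = 0.1
coeffs = tuple(c - lr * g for c, g in zip(coeffs, grads))
print(coeffs, poly3(2.0, coeffs), leaky_relu(-2.0))
```

In the non-learnable variant the coefficients would simply stay fixed at their initial values throughout training.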

artificial intelligence, deep learning, machine learning, (17 more...)

2410.12572

Country:

Europe > Switzerland (0.05)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.05)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)